ABSTRACT

Context. The optical and ultraviolet emission lines of galaxies are widely used to distinguish star-forming galaxies (SF) from active 
galactic nuclei (AGNs). However, this type of diagnostic has some associated uncertainties, because AGNs can be of low luminosity 
and/or heavily obscured, and the optical emission lines may be dominated by a stellar component. On the other hand, and despite its 
limitations, X-ray emission can be used as a reliable tracer of luminous AGNs. Several well-studied examples exist where the optical 
diagnostics are indicative of an SF galaxy, but the X-ray properties reveal the presence of an AGN.

Aims. We aim to characterize the nature of galaxies whose optical emission line diagnostics are consistent with star formation, but 
whose X-ray properties strongly point towards the presence of an AGN. Understanding these sources is of particular importance in 
assessing the completeness of AGN samples derived from large galaxy surveys, selected solely on the basis of their optical spectral 
properties. 

Methods. We construct a large sample of 211 narrow emission line galaxies (NELGs), with emission-line full widths at half maximum
(FWHMs) < 1200 km/s, from the SDSS-DR7 galaxy spectroscopic catalogue, for which we are able to construct
a classical diagnostic diagram, [OIII]/Hβ versus [NII]/Hα (hence z < 0.4), and that are also detected in the 2-10 keV X-ray band
and present in the 2XMM X-ray source catalogue. This sample offers a large database with which to investigate potential mismatches
between optical diagnostics and X-ray emission.

Results. Among these 211 objects, which based on our selection criteria are all at z < 0.4, we find that 145 galaxies are diagnosed
as AGNs, having 2-10 keV X-ray luminosities that span a wide range, from 10^40 erg/s to above 10^44 erg/s. Out of the remaining 66
galaxies, which are instead diagnosed as "star-forming", we find a bimodal distribution in which 28 have X-ray luminosities in excess
of 10^42 erg/s, large thickness parameters (T = F_2-10keV/F_[OIII] > 1) and large X-ray to optical flux ratios (X/O > 0.1), while the rest
are consistent with being simply star-forming galaxies. Those 28 galaxies exhibit the broadest Hβ line widths (FWHMs from ~ 300 to
1200 km/s), and their X-ray spectrum is steeper than average and often displays a soft excess.

Conclusions. We therefore conclude that the population of X-ray luminous NELGs with optical lines consistent with those of a
star-forming galaxy (which represent 19% of our whole sample) is largely dominated by narrow line Seyfert 1s (NLS1s). The occurrence
of such sources in the overall optically selected sample is small (< 2%), hence the contamination of optically selected galaxies by
NLS1s is very small.

Key words. galaxies: active, galaxy: fundamental parameters, galaxies: nuclei, galaxies: Seyfert, X-rays: general

* e-mail: castello@ifca.unican.es

1. Introduction

Emission lines in galaxies convey information about their
emitter, such as the power and nature of the underlying exciting
source, as well as the geometry, physical condition and chemical
composition of the gas, among other properties. Emission line
data are available for many galaxies, which allows them to be
classified depending on their physical nature, as either star
forming (SF) or hosts of active galactic nuclei (AGNs). On
the other hand, determining the fraction of galaxies hosting
AGNs at their centres is an essential step for studies of galaxy
evolution. The way in which AGNs are usually identified among
the large samples of galaxies in massive optical spectroscopic
surveys is almost invariably in terms of their emission lines.
Baldwin et al. (1981) were amongst the first to introduce robust
emission-line diagnostic ratios able to distinguish between
star-forming processes and active nuclear emission. These and
other diagrams were later used by Veilleux & Osterbrock (1987)
to derive a semi-empirical classification scheme to distinguish
between AGNs and SF galaxies. Kewley et al. (2001) used
these same diagrams to derive a purely theoretical classification
scheme, which was later extended by both Kauffmann et al.
(2003) and Stasińska et al. (2006). The underlying idea is
that the emission lines in normal SF galaxies are powered by
massive stars, so there is an upper limit to the intensity ratio of
the collisionally excited lines with respect to the recombination
lines (such as Hα or Hβ). In contrast, the photons from an
AGN extend to yet higher energies and therefore induce more
heating, implying that optical collisionally excited lines should
be brighter with respect to recombination lines than in the case
of ionization only by massive stars.

However, as noted by a number of authors (e.g.,
Severgnini et al. 2003; Page et al. 2006; Caccianiga et al.
2007; Trump et al. 2009), emission lines can be hidden, diluted,
or masked by the stellar light from the galaxy. This is a
particular problem in a low luminosity AGN (LLAGN), where
the observed star formation and AGN components may be of
comparable brightnesses. In these cases, the evidence of AGN
activity in optical spectra can be weakened in a significant
number of objects. In other cases, the regions producing the
narrow and broad line emission characteristic of an AGN
may be obscured by dust in the host galaxy and/or in the
nuclear regions (see Iwasawa et al. 1993; Comastri et al. 2002;
Rigby et al. 2006; Civano et al. 2007). Such objects would be
optically classified as either SF galaxies or HII regions, on the
basis of emission-line diagnostic diagrams. Therefore, emission
line diagnostics are not always a reliable means of detecting
AGNs within galaxy samples.

As an alternative tracer, X-ray emission is a virtually 
universal feature of the AGN phenomenon. Accretion onto the 
supermassive black hole (SMBH) produces X-ray emission
by means of inverse Compton scattering by a hot electron 
corona of ultraviolet (UV) photons emitted from the accretion 
disc. Hard X-rays are not strongly attenuated along their 
path from the central engine, except by extremely high gas 
columns of > 2 × 10^24 cm^-2, which occur in the so-called
Compton thick sources. The latter are believed to make a 
significant contribution to the infrared (IR) and submillimeter 
backgrounds, because of the absorption of a large fraction of 
their continuum at shorter wavelengths, and its reradiation at 
longer wavelengths. This gas column density is equivalent to 
several tens of magnitudes of optical extinction, for Galactic 
gas to dust ratios. Contemporary X-ray observatories such as 
NASA's Chandra and ESA's XMM-Newton are sensitive to
photon energies of up to ~ 10 keV, thus detection by these 
facilities is a very robust indicator of AGN activity. However, 
in spite of this advantage of X-ray selection there is no single 
method capable of selecting a complete sample of AGNs. On 
the one hand, dust is the main problem for optical selection, 
particularly for edge-on disc galaxies. On the other hand, X-ray 
selection is biased against sources in which the X-ray emission 
is heavily absorbed and/or Compton-scattered by dense gas 
clouds close to the central engine. Thus, no single method is 
able to identify all the AGNs found by other methods. 

When both X-ray data and optical spectra are available for
the same galaxy, galaxies optically classified as SF may be
found to have high X-ray luminosities, exceeding those of the
most luminous SF galaxies known in the local Universe by over
an order of magnitude (> 10^42 erg s^-1). The origin of this
classification discrepancy is not fully understood. Trouille &
Barger (2010) compared the optical classifications with the X-ray
properties of a complete sample selected from three Chandra
fields. They found that the optical diagnostic diagram
misidentified 20-50% of their X-ray selected AGNs, in the sense
that many X-ray AGNs were misclassified as SF galaxies.
Yan et al. (2011) reported another case of misclassification.
Their analysis of the relationship between the X-ray properties
and optical emission lines suggests that a large fraction of the
X-ray emitting AGNs would not have been identified using
emission line diagnostic diagrams. They also found that there
are indications of large classification discrepancies between
X-ray and optically selected AGN samples at z ~ 1.

Understanding why a fraction of galaxies exhibit emission
line diagnostics compatible with star formation, yet have X-ray
properties that are indicative of an AGN, is of considerable
importance. It must relate to the demographics of AGNs, black
hole growth, and galaxy evolution. To address these issues, we
present a study of a large sample of narrow emission line
galaxies (NELGs) from the Sloan Digital Sky Survey (SDSS), whose
X-ray emission properties are all available from the 2XMM X-ray
source catalogue. We devote special attention to those objects
optically classified as SF galaxies, but with 2-10 keV
luminosities in excess of 10^42 erg/s. We conduct a detailed
analysis of this population, using other optical spectral
features, e.g. the full width at half maximum (FWHM) of the Hβ
emission line, as well as the X-ray spectral properties. We
discover that this optically misclassified population, which
represents over 10% of our full sample of narrow emission line
galaxies with detected X-ray emission, consists of narrow line
Seyfert 1 galaxies (NLS1s). If, however, all narrow emission-line
galaxies for which there is XMM-Newton X-ray coverage (regardless
of whether they have an X-ray detection or only an upper limit)
were considered, the fraction of misidentifications would be
only 1.5-5%.

The structure of the paper is as follows. In Section 2 we
select the sample and describe both classification methods,
highlighting the disagreements between them. Our modelling of the
X-ray data is presented in Section 3, which focusses on the
general spectral features found for the whole sample. Finally, in
Section 4 we discuss our results and their implications. We
assume a concordance cosmology with H_0 = 70 km s^-1 Mpc^-1,
Ω_Λ = 0.73, and Ω_M = 0.27.



2. Optical and X-ray properties 

To build a sample of galaxies having both X-ray data and optical
spectra and showing prominent narrow emission lines and a lack
of broad components, we performed a cross-correlation between
the 2XMMi catalogue and the spectroscopic SDSS DR7 catalogue
(footnote 1). The SDSS DR7 is the seventh major data release of
the Sloan Digital Sky Survey. The SDSS spectroscopic data have a
sky coverage of ~ 8200 deg^2, a spectral coverage from 3800 Å to
9200 Å, and a spectral resolution ranging from 1850 to 2200.

We filtered the resulting large sample as described below.
Our final sample is composed of 211 NELGs, with Hβ line
widths ranging from 140 km s^-1 up to 1200 km s^-1. The X-ray
selection criteria resulted in all spectra having a minimum of 30
counts above 2 keV in at least one detector, i.e. PN, MOS1, or
MOS2.



2.1. Sample selection 

Our sample selection process consisted of five stages: 

1. Identification of NELGs. The first step was to obtain a
large sample of NELGs from the SDSS DR7 catalogue. We
adopted an operational line width cut-off of FWHM(Hβ) < 1200
km s^-1 to reject galaxies with broad emission lines. This
value was chosen by taking into account the strongly
bimodal distribution of measured Hβ FWHM values for an
emission-line galaxy sample (see Hao et al. 2005, Fig. 6).
This indicates that there is a natural separation between
broad and narrow line AGNs. Our selection results in a
sample composed of objects of different types including:
type 2 AGNs with a broad range of luminosity, obscured
AGNs, galaxies whose emission is not dominated by nuclear
activity and are classified as normal star-forming galaxies,
and type 1.9 Seyferts, which exhibit only a weak broad
component of Hα (Osterbrock 1981).

1 The SDSS DR7 data archive server is available at
http://www.sdss.org/dr7

2. BPT diagram as indicator of SF/AGN activity. The
standard method used to classify galaxies as either SF or
AGNs is based on their emission-line ratios. Among all
available combinations of emission lines, in this work we
used [OIII]λ5007/Hβ versus [NII]λ6584/Hα (Baldwin et al.
1981, often referred to as the BPT diagram), because it is
the one that most clearly distinguishes between these two
classes (Stasińska et al. 2006). In the BPT diagram (see
Figure 1), the AGNs occupy the region above and to the
right of the borderline, whilst SF galaxies are found to
the bottom left of the parameter space (Kewley et al. 2001;
Kauffmann et al. 2003). The reason for using these
emission-line ratios is twofold: first, the lines used to compute
each ratio are very close together in wavelength, and
consequently the line ratios are largely insensitive to dust;
second, the [OIII]λ5007 emission line luminosity is often used
as an isotropic indicator of AGN activity. This assumes that the
[OIII] emission is an unbiased and orientation-independent
measure of the ionizing flux from the AGNs, whereas X-ray
photons from the compact nucleus may be strongly attenuated
in the plane of the dusty torus. Therefore we only selected
objects in which the four emission lines of Hβ, Hα,
[OIII]λ5007, and [NII]λ6584 were detected. On the basis
of these selection criteria, all of them are at redshift z < 0.4.
Given the typical signal-to-noise (S/N) ratio and the
instrumental resolution of the SDSS spectra, we omitted objects
that had an observed FWHM for any of the four emission
lines smaller than 70 km s^-1, as they are likely to be spurious
detections or poorly detected lines. Applying these criteria,
we selected about 150000 nearby NELGs.

3. Cross-correlation between 2XMMi DR3 & SDSS DR7. To
identify possible X-ray counterparts, we analysed
XMM-Newton observations covering the sky positions of these
NELGs. A total of 1729 have some X-ray exposure time,
although in the majority of cases only upper limits to their
X-ray emissivity were found. We cross-correlated this parent
sample with the Incremental Second XMM-Newton
Serendipitous Source Catalogue 2XMMi-DR3 (Watson et al.
2009), released in April 2010 (footnote 2), applying a matching
radius of 3 arcsec. This radius is chosen as a compromise between
allowing for some positional error and minimizing the
probability of spurious matches (Watson et al. 2009). We
considered only X-ray sources with a 0.2-12 keV European Photon
Imaging Camera (EPIC) detection likelihood above 3σ in at
least one camera.

2 Available from http://xmm.esac.esa.int

4. Lx as an indicator of AGN activity. Our primary goal is to
characterize the populations of NELGs by comparing their
optical classification with their X-ray properties. The
standard technique employed to identify AGNs in emission line
galaxies is via an empirical X-ray luminosity threshold at
Lx > 10^42 erg s^-1. This is a very conservative limit, based
on the fact that very few local star-forming galaxies have
higher X-ray luminosities, with a few possible exceptions, e.g.
NGC3256 (Lira et al. 2002). While very luminous AGNs can be
unambiguously identified in almost any energy band, AGNs
become progressively more challenging to identify at lower
luminosities, when their emission may be equal to, or even less
than, that from the host galaxy. The hard X-ray (2-10 keV)
luminosity (Lx) is a good indicator of AGN activity, because
the X-ray spectra of SF galaxies are typically softer than
those of AGNs. However, lower luminosity X-ray sources
with hard spectra can be either AGNs or arise from high
mass X-ray binaries. Hence there is a problem in using
X-rays alone to distinguish between low luminosity or obscured
AGNs and a population of high mass X-ray binaries in
star-forming galaxies. One way to distinguish between these
possibilities is to apply the empirical limit in X-ray luminosity
of starburst galaxies at around Lx ~ 10^41 erg s^-1. A factor
of ten above this value, it is assumed that the X-rays
probably arise from accretion onto a SMBH. We have
adopted this higher value as the representative dividing line for
star-forming activity: all objects with an X-ray luminosity higher
than 10^42 erg s^-1 are assumed to host an active nucleus.
Galaxies with lower X-ray luminosities are consistent with
SF galaxies. But it is possible that low luminosity X-ray
sources may still be weak AGNs, e.g. low-ionization nuclear
emission-line regions (LINERs), or be heavily obscured.
Other information, such as X-ray spatial extent or
variability, is needed to confirm the origin of the emission. To
select galaxies that might potentially host an AGN, we required
our sources to have a well-defined count rate in the 2-12 keV
energy range, i.e. we considered only X-ray sources with a
2-12 keV European Photon Imaging Camera (EPIC) detection
likelihood above 3σ in at least one camera. This requirement
resulted in X-ray spectra with a minimum of 30
counts in at least one detector.

5. Final sample. Finally, we performed a visual inspection of
the optical data in each of these SDSS/2XMMi pairs to
confirm that all the matches were indeed genuine. Special care
was taken to examine sources that showed some signs of
Hα and/or Hβ broad emission lines, as well as spurious
sources. After inspection of the SDSS spectra, we excluded
29 objects. These sources showed either strong reddening or
low S/N in the blueward part of the Hβ line, or only weak
broad Hα lines (i.e. they were Seyfert 1.9s). In some cases, we
could detect weak broad Hα and Hβ lines (e.g. a Seyfert
1.8). After removing these sources, our final sample
contained 211 galaxies that had only narrow emission lines and
reliable X-ray flux detections in the 2-12 keV band.
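The line-width cuts and the 3 arcsec positional match described in the steps above can be illustrated with a minimal sketch. It is not the pipeline used for this sample: the function names, dictionary keys, and the toy coordinates are hypothetical, and a brute-force nearest-neighbour search stands in for whatever matching tool was actually applied.

```python
import math

def passes_line_cuts(fwhm_kms, narrow_limit=1200.0, min_fwhm=70.0):
    """Apply the line-width cuts to one galaxy.

    fwhm_kms is a hypothetical dict with the FWHM (km/s) of the four
    BPT lines, keyed "Hb", "Ha", "OIII", "NII".
    """
    all_resolved = all(v >= min_fwhm for v in fwhm_kms.values())
    return all_resolved and fwhm_kms["Hb"] < narrow_limit

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine) between two positions in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0

def cross_match(optical, xray, radius_arcsec=3.0):
    """Return {optical_id: xray_id} for the nearest X-ray source within the radius."""
    matches = {}
    for oid, (ora, odec) in optical.items():
        best_id, best_sep = None, radius_arcsec
        for xid, (xra, xdec) in xray.items():
            sep = angular_separation_arcsec(ora, odec, xra, xdec)
            if sep <= best_sep:
                best_id, best_sep = xid, sep
        if best_id is not None:
            matches[oid] = best_id
    return matches

# Toy positions (RA, Dec in degrees); purely illustrative.
optical = {"gal_A": (150.0000, 2.0000), "gal_B": (150.1000, 2.1000)}
xray = {"src_1": (150.0005, 2.0000), "src_2": (150.2000, 2.3000)}
matches = cross_match(optical, xray)
```

In this toy case only gal_A has a counterpart inside the matching radius; gal_B would be treated as an X-ray non-detection.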

2.2. Optical classification versus X-ray emission 

The BPT diagrams have been used to infer whether the gas in
a given galaxy is excited by star formation or by radiation from
an accretion disc around a central SMBH, and have become one of
the major tools for the classification and analysis of emission
line galaxies in the SDSS (York et al. 2000). To be
conservative in our analysis, we adopted the dividing line between
SF and AGN galaxies presented by Kauffmann et al. (2003,
hereafter Kauf03). In that work, a refined optical classification
was obtained, based on a combination of stellar population
synthesis models plus detailed self-consistent photoionization
models, in order to create a theoretical starburst line projected
onto the BPT diagram. This is given by

\log\left(\frac{[\mathrm{OIII}]\lambda5007}{\mathrm{H}\beta}\right) = \frac{0.61}{\log\left([\mathrm{NII}]\lambda6584/\mathrm{H}\alpha\right) - 0.05} + 1.3. \qquad (1)

Kewley et al. (2001) used a different separation criterion
between SF and AGN galaxies in the BPT diagram, which lies
above and to the right of the Kauf03 line, defining a region often
known as the LINER/transition region where sources exhibit
both AGN and starburst activity. If we adopted the Kewley et al.
(2001) borderline, then the fraction of X-ray luminous NELGs
classified as SF galaxies would be significantly higher, at least
twice as high (see e.g. Jackson et al. 2012, where their three
sources fall between both borderlines). In this paper, we prefer to
focus on the X-ray luminous NELGs that are uncontroversially
classified as SF galaxies using the BPT diagram, hence we adopt
the Kauf03 criterion. Following this criterion, we can directly
obtain an optical classification for our sources: star-forming
galaxies are those lying below the Kauf03 line, i.e. optically
classified SF or BPT-SF, and conversely those located above the
line are classified as AGN, i.e. as either an optically classified
AGN or BPT-AGN.
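As an illustration of how the Kauf03 criterion separates the two optical classes, the following sketch evaluates the demarcation curve, log([OIII]/Hβ) = 0.61/(log([NII]/Hα) - 0.05) + 1.3, for a pair of line ratios. The function names are hypothetical, and the treatment of sources to the right of the vertical asymptote at log([NII]/Hα) = 0.05 as BPT-AGN is an assumption about how the curve is applied in practice.

```python
import math

def kauf03_line(log_nii_ha):
    """Kauffmann et al. (2003) curve: log([OIII]/Hb) as a function of log([NII]/Ha)."""
    return 0.61 / (log_nii_ha - 0.05) + 1.3

def classify_bpt(nii_ha, oiii_hb):
    """Return "BPT-SF" below the Kauf03 curve and "BPT-AGN" otherwise.

    nii_ha  : flux ratio [NII]6584 / Halpha (linear, not log)
    oiii_hb : flux ratio [OIII]5007 / Hbeta (linear, not log)
    """
    x = math.log10(nii_ha)
    y = math.log10(oiii_hb)
    # The curve diverges at x = 0.05; sources at or beyond it cannot lie
    # in the star-forming locus, so they are assigned to the AGN side.
    if x >= 0.05:
        return "BPT-AGN"
    return "BPT-SF" if y < kauf03_line(x) else "BPT-AGN"

example_sf = classify_bpt(0.1, 1.0)    # weak [NII], modest [OIII]
example_agn = classify_bpt(0.3, 3.0)   # stronger [NII] and [OIII]
```

Swapping in the Kewley et al. (2001) curve would only require changing the constants in `kauf03_line`, which is why the choice of borderline changes the number of sources counted as BPT-SF.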

To see how the optical classification compares with X-ray
luminosities, we must first obtain the hard X-ray intrinsic
luminosity of each source. We used the X-ray fluxes and
spectroscopic redshifts (footnote 3) to calculate the rest-frame
2-10 keV luminosities (Lx) assuming an X-ray spectrum in the form
of a power law modified by Galactic absorption with continuum
spectral slope Γ = 1.7; we then corrected these luminosities for
the Galactic absorption to compute the intrinsic luminosities, but
did not correct for any possible intrinsic absorption. Figure 1
shows the BPT diagram, where the dashed line is the Kauf03
criterion and the points change in both form and colour according
to their value of Lx to highlight the disagreements between the
optical-based and X-ray-based classifications. From the optical
diagram and according to the values of Lx, the sample was split
into four subsamples (see Table 1):

- Weak-AGN subsample: consisting of 62 sources classified
as AGNs according to the BPT diagram notwithstanding
their low luminosities, not exceeding 10^42 erg s^-1.

- Strong-AGN subsample: including 83 NELGs identified as
BPT-AGNs that in turn have luminosities exceeding 10^42
erg s^-1.

- True-SF subsample: consisting of 38 sources that were
classified as SF according to both the BPT diagram and the Lx
criterion.

- Missing-AGN subsample: comprising 28 objects classified as
BPT-SF that nevertheless have Lx > 10^42 erg s^-1.
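The luminosity calculation and the four-way split above can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes the standard power-law k-correction, L = 4π d_L^2 F (1+z)^(Γ-2), takes the flux as already corrected for Galactic absorption, and computes the luminosity distance for the adopted cosmology (H_0 = 70 km s^-1 Mpc^-1, Ω_M = 0.27, Ω_Λ = 0.73) by Simpson integration. All function names are hypothetical.

```python
import math

H0 = 70.0            # km/s/Mpc, as adopted in the text
OMEGA_M = 0.27
OMEGA_L = 0.73
C_KMS = 299792.458   # speed of light in km/s
MPC_CM = 3.0857e24   # centimetres per megaparsec

def luminosity_distance_cm(z, steps=1000):
    """Flat LCDM luminosity distance via composite Simpson integration of 1/E(z)."""
    def inv_e(zp):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zp) ** 3 + OMEGA_L)
    h = z / steps  # steps must be even for Simpson's rule
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    comoving_mpc = (C_KMS / H0) * total * h / 3.0
    return (1.0 + z) * comoving_mpc * MPC_CM

def rest_frame_lx(flux_cgs, z, gamma=1.7):
    """Rest-frame luminosity (erg/s) from an observed-frame flux (erg/cm^2/s).

    Assumes a power-law spectrum with photon index gamma, so the
    k-correction is (1 + z)**(gamma - 2).
    """
    d_l = luminosity_distance_cm(z)
    return 4.0 * math.pi * d_l ** 2 * flux_cgs * (1.0 + z) ** (gamma - 2.0)

def subsample(bpt_class, lx, threshold=1e42):
    """Combine the optical class with the Lx threshold into the four subsamples."""
    luminous = lx > threshold
    if bpt_class == "BPT-AGN":
        return "strong-AGN" if luminous else "weak-AGN"
    return "missing-AGN" if luminous else "true-SF"

# Illustrative source: 2-10 keV flux of 1e-13 erg/cm^2/s at z = 0.1.
lx_example = rest_frame_lx(1e-13, 0.1)
label_example = subsample("BPT-SF", lx_example)
```

For the illustrative flux and redshift, the luminosity lands above the 10^42 erg s^-1 threshold, so a BPT-SF source with these values would fall in the missing-AGN subsample.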

This classification clearly implies that there is a mismatch
between the optical-based and X-ray-based classifications in
the missing-AGN and weak-AGN subsamples (also found by
Yan et al. 2011). On the one hand, there are several
explanations of the low luminosity emitted by weak-AGNs. A
significant fraction of the population of AGNs in the local
Universe displays a low X-ray luminosity, not exceeding
10^42 erg s^-1 (LLAGN; see Barth 2002). In particular,
low-ionization nuclear emission-line regions (LINERs) were
originally defined by Heckman (1980) as a subclass of these
LLAGNs, whose optical spectra are dominated by strong low
ionization lines and much weaker higher ionization lines than
classical AGNs. According to Heckman (1980), LINERs are galaxies
that satisfy [OII]λ3727 > [OIII]λ5007 and
[OI]λ6300/[OIII]λ5007 > 1/3. According to
these criteria, we classified 8 (13%) sources in our weak-AGN
subsample as pure LINERs and an additional 18 (29%) as
weak-[OI] LINERs. The latter fully satisfy the Filippenko &
Terlevich (1992) definition (i.e. [OI]λ6300/Hα < 1/6). On the
other hand, the high values of the luminosity emitted by the
sources in our missing-AGN subsample suggest that these sources do
contain AGNs even though they lie beneath the Kauf03 line,
implying that optical AGN signatures are lacking and signs of star
formation, such as strong Hα and Hβ lines, are clearly visible.
The nature of this misclassified population is discussed
throughout this paper.

2.3. Optical versus X-ray properties 

After discussing the optical classifications of our NELG sample 
and the mismatches with the X-ray luminosities for some 
objects, we now compare their optical and X-ray properties in 
the context of three parameters in an attempt to provide clues 
about the nature of the source populations within the complete 
sample of NELGs. 

To obtain the most basic X-ray spectral information, we
performed a hardness ratio (HR) analysis using EPIC-pn data. We
adopted the standard hardness ratio, defined as

\mathrm{HR} = \frac{H - S}{H + S}, \qquad (2)

where S and H are the PSF and vignetting-corrected count rates
in the 0.5-2 keV and 2-4.5 keV energy bands, respectively. A
HR analysis is much simpler than a complete spectral analysis
and is often the only X-ray spectral information available for the
faint sources in the XMM-Newton catalogue. We note that the
X-ray selection criteria resulted in a minimum count threshold
of around 30, hence a proper X-ray spectral analysis could not
be performed for a number of our sources. The HR parameter
is an approximate indicator of the intrinsic X-ray spectral
shape, which is also sensitive to the level of absorption. An
unabsorbed X-ray spectrum typically has a lower HR than an
absorbed one. Although this correlation is relatively weak and
redshift-dependent (Trouille et al. 2009), the vast majority of
our missing-AGNs have a low HR that is consistent with being
unabsorbed, as shown in Figure 2.
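A short sketch of the hardness ratio defined above, HR = (H - S)/(H + S), is given here for concreteness. The uncertainty estimate is an added assumption (first-order Gaussian propagation with independent errors on the two count rates), not something stated in the text, and the function names are hypothetical.

```python
import math

def hardness_ratio(soft_rate, hard_rate):
    """HR = (H - S) / (H + S) from the 0.5-2 keV (S) and 2-4.5 keV (H) count rates."""
    total = hard_rate + soft_rate
    if total <= 0.0:
        raise ValueError("the summed count rate must be positive")
    return (hard_rate - soft_rate) / total

def hardness_ratio_error(soft_rate, hard_rate, soft_err, hard_err):
    """First-order error on HR, assuming independent Gaussian errors.

    dHR/dH = 2S / (H + S)^2 and dHR/dS = -2H / (H + S)^2.
    """
    total = hard_rate + soft_rate
    if total <= 0.0:
        raise ValueError("the summed count rate must be positive")
    variance = (soft_rate * hard_err) ** 2 + (hard_rate * soft_err) ** 2
    return 2.0 * math.sqrt(variance) / total ** 2

# Illustrative soft source: S = 0.03 cts/s, H = 0.01 cts/s.
hr_example = hardness_ratio(0.03, 0.01)
hr_err_example = hardness_ratio_error(0.03, 0.01, 0.002, 0.002)
```

A soft, unabsorbed spectrum gives a negative HR as in this example, while increasing absorption removes soft photons and pushes HR towards +1.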

3 Optical and X-ray parameters are taken from SDSS-DR7 and
2XMMi-DR3, respectively.

An alternative method for evaluating absorption is to
measure the X-ray luminosity, and compare this with an isotropic
indicator of the intrinsic power of the AGN. Assuming that the
unified AGN model is correct, in absorbed sources the X-ray
flux is attenuated with respect to this isotropic indicator by
an amount related to the absorbing column density. Taking the
reddening corrected [OIII]λ5007 luminosity as an isotropic
indicator of the source nuclear strength, we calculated the ratio
of the hard X-ray to [OIII] fluxes (Bassani et al. 1999, hereafter
thickness parameter or T). According to Bassani et al. (1999),
Seyfert 1 galaxies lie in the range 1 < T < 100, Compton-thin
Seyfert 2 galaxies in the range 0.1 < T < 10, and
Compton-thick Seyfert 2 galaxies at T < 0.1. To estimate the
thickness parameter, T = F^c_HX/F^c_[OIII], the X-ray and [OIII]
fluxes were corrected for Galactic absorption and reddening,
respectively. We used the Bassani et al. (1999) relation to derive
F^c_[OIII] from

F^{c}_{[\mathrm{OIII}]} = F^{\mathrm{obs}}_{[\mathrm{OIII}]} \left[\frac{(\mathrm{H}\alpha/\mathrm{H}\beta)_{\mathrm{obs}}}{3.0}\right]^{2.94},

which assumes an intrinsic Balmer decrement equal to 3.0 as
predicted in the NLR (see Osterbrock & Ferland 2006).
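The reddening correction and the resulting thickness parameter can be written out as a small sketch, assuming the Bassani et al. (1999) form in which the observed [OIII] flux is scaled by [(Hα/Hβ)_obs / 3.0]^2.94. The function names and the example fluxes are hypothetical, and no special handling is applied when the observed decrement is below 3.0.

```python
def oiii_flux_corrected(oiii_obs, halpha, hbeta, intrinsic_decrement=3.0):
    """Reddening-corrected [OIII]5007 flux from the observed Balmer decrement.

    oiii_obs, halpha, hbeta are observed line fluxes in the same units.
    Note: if the observed decrement is below the intrinsic value, the
    factor is < 1; this sketch does not clamp it.
    """
    decrement = halpha / hbeta
    return oiii_obs * (decrement / intrinsic_decrement) ** 2.94

def thickness_parameter(f_hx_corrected, f_oiii_corrected):
    """T = F_HX / F_[OIII], with both fluxes already corrected."""
    return f_hx_corrected / f_oiii_corrected

def is_compton_thick_candidate(t_value):
    """Compton-thick Seyfert 2 galaxies are expected at T < 0.1 (Bassani et al. 1999)."""
    return t_value < 0.1

# Illustrative values: observed decrement of 6.0, i.e. twice the intrinsic one.
oiii_corr = oiii_flux_corrected(1e-15, 6.0e-15, 1.0e-15)
t_example = thickness_parameter(1e-13, oiii_corr)
```

Because the Seyfert 1 and Compton-thin Seyfert 2 ranges quoted above overlap (1 < T < 10), only the Compton-thick boundary is encoded as a hard cut here.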

Finally, a useful parameter used to discriminate between
different classes of X-ray sources is the X-ray-to-optical flux
ratio (see Maccacaro et al. 1988, hereafter X/O). In this paper,
we defined the X-ray to optical flux ratio using the observed
X-ray flux in the 0.5-4.5 keV energy range and the optical
r(SDSS) band flux (for the appropriate conversion factors see
Fukugita et al. 1995). X-ray selected AGNs (of both type 1 and
type 2) have typical X/O flux ratios in the range between 0.1 and
10 (see Fiore et al. 2003). An X/O ratio of above 10 is typical of
obscured AGNs at high-z and high-luminosity (type 2 QSOs), as
well as high-z clusters of galaxies and extreme BL Lac objects.
Values of X/O below 0.1 are found in coronal emitting stars,
normal galaxies (both early-type and star-forming), and nearby
heavily absorbed AGNs (Della Ceca et al. 2004).

Fig. 1. Emission line diagnostic diagram (BPT diagram) for the whole sample of NELGs. The curve separating the AGNs from
the non-AGNs (bottom left) zone is taken from Kauffmann et al. (2003). Symbols change in both form and colour according to the
hard X-ray luminosity, Lx. To avoid confusion, only the mean errors are reported (bottom right). The estimated errors increase with
decreasing log [NII]/Hα.

Table 1. Classification definitions of NELG objects.

Sub-sample              Optical  X-ray    N    T > 10      X/O > 0.1   FWHM(Hβ) > 600 km/s
                                               (fraction)  (fraction)  (fraction)
BPT-SF   true-SF        SF       SF/AGN   38   0.10        0.00        0.03
         missing-AGN    SF       AGN      28   0.93        0.96        0.89
BPT-AGN  weak-AGN       AGN      SF/AGN   62   0.03        0.08        0.05
         strong-AGN     AGN      AGN      83   0.30        0.79        0.19

Notes. Col. 1: sub-sample name; col. 2: classification of the source using the Kauf03 line as a separation criterion between SF and AGN activity;
col. 3: X-ray based classification according to hard X-ray luminosity, Lx; col. 4: total number of sources in the subsample; col. 5 (6): the fraction of
sources in the subsample that display a thickness parameter (X-ray-to-optical flux ratio) higher than 1 (0.1) as expected for a typical AGN; col. 7:
the fraction of sources that exhibit an Hβ FWHM larger than 600 km/s.

Thus, we should expect that the thickness parameter and the
X/O values fall within the typical range of values for BPT-SF
galaxies, i.e. T < 1 and X/O < 0.1 (respectively). We expect
correspondingly that BPT-AGN will fall outside of this range.
Figure 2 shows the combined information provided by these
three parameters: X/O versus HR and T versus HR. We have
used different symbols to denote different ranges of Lx; filled
and empty symbols denote the optical classification, identified
as BPT-AGN and BPT-SF respectively. Whilst nearly all of the
X-ray-based AGNs (missing-AGN and strong-AGN subsamples,
see Table 1) have typical AGN values for both the X/O and T
parameters, the values for the true-SF and weak-AGN subsamples
are more consistent with being SF galaxies, for which X/O and
T are lower than 0.1 and 1, respectively. Despite this, we cannot
define a range of values that isolates AGNs from the rest of the
sources, either by using T or X/O, i.e. a NELG classified as
either a SF galaxy or an AGN (by using either optical-based
or X-ray-based criteria) does not occupy a definite region in
these parameter spaces. Analogously, hardness ratios do not
clearly cluster around different values for different subclasses of
objects. In general, one would expect SF galaxies to have
an X-ray spectrum dominated by a thermal component that is
softer than the typical power-law spectra exhibited by an AGN.
However, the mix of BPT-SF and BPT-AGN galaxies does not
show a clear trend in hardness ratio. Thus, we cannot
establish a clearly defined criterion to differentiate AGN from
SF, in the context of the three analysed parameters.

However, we find that the log T as well as the log X/O
distributions display bimodal shapes for the BPT-SF population,
opening the possibility that the emission of the missing-AGN
population has a different nature (see Figure 2).

Figure 3 shows the X-ray luminosity as a function of the
Hβ FWHM for these two optical populations. From this figure,
it is evident that this bimodal feature of the BPT-SF population
is directly linked to the values of Hβ FWHM. There is an
almost one-to-one correspondence for the NELGs diagnosed as
BPT-SF galaxies between the FWHM of their Hβ line and the
X-ray luminosity. Among this sample of 66 BPT-SFs, we indeed
found that nearly all those with Lx < 10^42 erg s^-1 exhibit an
Hβ FWHM < 600 km s^-1, while all the more X-ray luminous
objects (which should contain an AGN) have Hβ FWHMs of
between 600 and 1200 km s^-1. However, this division based on
Hβ FWHM does not apply to the BPT-AGNs, where the vast
majority have FWHMs smaller than ~600 km s^-1, independent of
the value of Lx.

Fig. 2. Optical and X-ray properties. Upper two panels: T = F^c_HX/F^c_[OIII] versus HR (on the left) and the X-ray-to-optical flux ratio
distribution as a function of HR (on the right). Different symbols mark X-ray luminosity and the filled/empty symbols represent
the optical classification (BPT-AGN and BPT-SF, respectively). In both plots, the point with the error bars is a fictitious source that
represents the mean error of the parameters. Bottom two panels: Distribution of the thickness parameter (on the left) and the
X/O distribution (on the right) for the two optical subsamples.



2.4. An overview of the missing-AGN subsample 

On the basis of the estimated upper limits to the 2-10 keV
luminosity given by FLIX⁴ for objects that were identified as
NELGs but lacked X-ray detections, we identified another 1207
galaxies that were classified as SF based on their position in the
BPT diagram. However, only 5% (60/1207) of them could be
missing-AGN candidates, i.e. those for which the upper limit to
their 2-10 keV luminosity exceeds 10^42 erg s^-1.

Therefore the missing-AGN candidates represent only between
2% (28) and 7% (28 + 60) of the BPT-SF population (66 + 1207)



4 FLIX: upper limit server for XMM-Newton data provided by the 
XMM-Newton Survey Science Centre. 
Fig. 3. X-ray luminosity as a function of the Hβ FWHM. The
vertical dotted line marks the threshold of FWHM(Hβ) = 600 km/s,
while the horizontal dashed line corresponds to L_X = 10^42 erg/s.
Different symbols are related to the optical classification:
BPT-AGN as filled circles and BPT-SF as empty triangles.






Castello-Mor et al.: X-ray luminous, optically SF galaxies 



and therefore they do not represent a major problem in terms of
incompleteness. In terms of the total sample of NELGs covered
by X-ray observations, the missing-AGN candidates represent
between 1.6% and 5% of the entire sample. However, the nature
of the missing-AGN subsample is poorly understood and needs to
be explored further.
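
The percentages quoted for the BPT-SF population follow directly from the source counts given above. The short Python sketch below simply redoes that arithmetic; the variable names are ours and are not taken from the paper.

```python
# Source counts quoted in the text (Sect. 2.4); names are illustrative only.
N_BPT_SF_DETECTED = 66        # X-ray detected BPT-SF galaxies
N_MISSING_AGN = 28            # of those, with L_X > 1e42 erg/s
N_BPT_SF_UPPER_LIMIT = 1207   # BPT-SF galaxies with X-ray upper limits only
N_CANDIDATES = 60             # of those, with an upper limit above 1e42 erg/s


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


total_bpt_sf = N_BPT_SF_DETECTED + N_BPT_SF_UPPER_LIMIT

# Fraction of undetected BPT-SF galaxies that could still host an AGN.
candidate_fraction = percent(N_CANDIDATES, N_BPT_SF_UPPER_LIMIT)

# Lower bound: only the confirmed missing-AGNs.
lower_bound = percent(N_MISSING_AGN, total_bpt_sf)

# Upper bound: confirmed missing-AGNs plus all candidates.
upper_bound = percent(N_MISSING_AGN + N_CANDIDATES, total_bpt_sf)

print(f"candidates among upper limits: {candidate_fraction:.1f}%")
print(f"missing-AGN share of BPT-SF: {lower_bound:.1f}% to {upper_bound:.1f}%")
```

The lower bound counts only the 28 confirmed objects, whereas the upper bound assumes that every one of the 60 candidates is a genuine missing-AGN.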

In the above section we have described the X-ray and optical
spectral properties used to explore the nature of the elusiveness
of optical signatures in the missing-AGN subsample. Similar
studies have been performed previously, focussing on the nature
of the so-called elusive AGNs, i.e. sources that show no signs
of AGN activity in the optical regime, but display signs of
AGN activity in the X-ray band. One possibility is that they
could be mildly/heavily obscured AGNs in which star formation
dominates the optical emission-line ratios. The obscuration of
the narrow-line region may well be caused by gas and dust
close to the galactic nucleus and therefore, comparing the
X-ray to [O III] emission should reveal the level of absorption
(Maiolino et al. 1998; Bassani et al. 1999). However, given that
the thickness parameter is in the range T > 10, the hydrogen
column density would have to be < 10^23 cm^-2 and therefore
obscuration by the torus is not very likely to be the cause of the
elusiveness of optical AGN signatures in these sources.

Another possibility is that the sample contains composite
objects, i.e. those hosting both star formation and an AGN.
On the basis of the relation between L_X (2-10 keV) and L_Hα
for SF galaxies (Ranalli et al. 2003; Kennicutt 1998), L_Hα/L_X
should be greater than 1, assuming A_V < 2. Whilst type 1 AGNs
have ratios lower by two orders of magnitude, composite
galaxies are expected to have intermediate ratios (Yan et al. 2011).
Figure 4 shows the distribution of log(L_Hα/L_X), where most of
the L_Hα/L_X ratios seem to lie between both extremes, as
expected in the composite galaxy range. This opens the
possibility that the missing-AGN population could be composite objects
having both star formation and active nuclei, although largely
consistent with being AGN.
Fig. 4. Distribution of L_Hα/L_X for the missing-AGN subsample.

The high values of both X/O and T, which are two orders of
magnitude higher than the average in SF galaxies and those
typical of Seyfert 1, the low hardness ratio and the quite high values
of the Hβ FWHM make our missing-AGN sources very likely
to be narrow-line Seyfert 1 galaxies (NLS1s), which are believed
to lie in the starburst region of the BPT diagrams. The NLS1
galaxies represent a subclass (Osterbrock & Pogge 1985) of type 1
AGNs that manifest a distinctive ensemble of properties. They
are AGNs with optical spectral properties similar to those of
Seyfert 1 galaxies, except for recombination lines that are only
slightly broader than forbidden emission lines. Studies of NLS1s
have identified many peculiar properties that extend well beyond
a pure line-width-based distinction. Distinctive features in
optical spectra of NLS1s are the low values of [O III]/Hβ and
the often strong, blended, permitted Fe II emission. Out of the 28
missing-AGNs, the Fe II multiplets were detected in 23 objects
at the > 3σ level. Among the rest of the subsample, there is
another rare class of NLS1s that do not exhibit strong Fe II
multiplets. For the three objects with very narrow Balmer lines (see
Figure 3), there is a prominent broad He II emission line, which
ensures their classification as type 1 AGNs. For two additional
sources, the SDSS spectra were too noisy to yield reliable
measurements of either the Fe II multiplets or He II, and in addition
they show evidence of high reddening.

The relative strength of the Fe II multiplets is usually expressed
as the flux ratio of Fe II to Hβ: R4570 = Fe II λλ4434-4684 / Hβ,
where Fe II λλ4434-4684 denotes the flux of the Fe II
multiplets integrated over the wavelength range 4434-4684 Å
after subtracting the local underlying continuum and the He II
λ4686 emission line. Figure 6 shows the distribution of the
relative strength of the Fe II multiplets, R4570, for the
missing-AGN subsample, which is compared with the distribution given
by Zhou et al. (2006). The average is <R4570> = 0.88 and
the 1σ dispersion is 0.5, which is consistent with the NLS1
sample of Zhou et al. (2006): the probability that both samples
are drawn from the same distribution is ~89% according to the
Kolmogorov-Smirnov test. The average is significantly larger than
the typical value of R4570 ~ 0.4 found in normal AGNs (Bergeron
& Kunth 1984), bolstering again the idea that these two
populations are probably of a different nature.
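
As an illustration of how a two-sample Kolmogorov-Smirnov comparison of this kind might be computed, the sketch below evaluates the KS statistic and an asymptotic probability in plain Python. It is a minimal sketch under stated assumptions: the function names are ours, the probability uses the standard large-sample series approximation with an effective-sample-size correction (we do not know which implementation the authors used), and the two input lists are hypothetical values, not measurements from this paper or from Zhou et al. (2006).

```python
import math


def ks_statistic(sample_a, sample_b):
    """Maximum distance between the two empirical cumulative distributions.

    Both samples are assumed to be non-empty.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d_max = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        # Step both empirical CDFs past every value equal to x (handles ties).
        while i < na and a[i] <= x:
            i += 1
        while j < nb and b[j] <= x:
            j += 1
        d_max = max(d_max, abs(i / na - j / nb))
    return d_max


def ks_probability(d_max, na, nb, terms=100):
    """Asymptotic probability that both samples share one parent distribution."""
    n_eff = na * nb / (na + nb)
    root = math.sqrt(n_eff)
    lam = (root + 0.12 + 0.11 / root) * d_max
    if lam < 0.2:
        # The alternating series converges slowly here; the probability is ~1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical R4570-like values, for illustration only.
r_sample = [0.4, 0.7, 0.9, 1.1, 1.5]
r_reference = [0.5, 0.8, 0.9, 1.2, 1.4, 1.6]

d = ks_statistic(r_sample, r_reference)
p = ks_probability(d, len(r_sample), len(r_reference))
print(f"D = {d:.3f}, probability = {p:.3f}")
```

A probability close to 1 means that the test finds no evidence that the two samples differ, which is the sense in which the ~89% quoted above supports the NLS1 interpretation.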

3. X-ray spectral analysis 

We carried out an X-ray spectral analysis of the BPT-SF
population, which consists of the missing-AGN and true-SF
subsamples. We recall that our main aim is to understand
the nature of the missing-AGN subsample, whose members are
optically classified as SF but have L_X in excess of
10^42 erg s^-1, which is indicative of an AGN. The results of
the previous section (T, X/O, HR, Hβ FWHM, and R4570) lead us
to propose that these objects are good candidate NLS1s. Thus,
the missing-AGN and true-SF subsamples were analysed as samples
of different natures.

3.1. Data reduction and spectral analysis 

All objects presented here were observed with XMM-Newton
between 2001 June and 2007 December. The European Photon
Imaging Cameras (EPIC) pn (Strüder et al. 2001) and MOS
(Turner et al. 2001; MOS1 and MOS2) were operated in full
frame imaging mode during all the observations. The
XMM-Newton data of some objects were previously presented in
the literature (see Table 2). For a fully homogeneous analysis
enabling robust conclusions, we reanalysed the XMM-Newton
spectra of these objects in exactly the same way as for the
objects whose XMM-Newton data are presented here for the first
time.

We chose to use EPIC-pn data because the pn camera has a larger
effective area, resulting in a higher S/N. The observation data
files (ODFs) were processed to produce calibrated event lists
using the Science Analysis System (SAS 10.0.0). We extracted the
source spectra using the good EPIC-pn events in circular regions
of radii ranging from 12" to 30" centred on the source position.
Table 2. Summary of some observational parameters.
Units: texp in ks; N_H,Gal in 10^20 cm^-2; L_X in 10^42 erg s^-1.

ID    SDSS DR7             2XMMi Catalogue    Obs ID      texp  EPIC-pn       Redshift  N_H,Gal  L_X    Notes
      SDSS...              2XMM...                              counts

"Missing AGN" sample
26    J141519.49-003021.5  J141519.4-003021   0145480101  13    1781 ± 44     0.135     3.28     8.27   a
42    J010712.03+140844.9  J010712.0+140844   0305920101  16    4502 ± 69     0.077     3.41     2.81   b
53    J135724.52+652505.9  J135724.5+652506   0305920501  1.7   931 ± 31      0.106     1.38     6.85   b
63    J111031.61+022043.2  iJ111031.6+022043  0504101801  8     344 ± 19      0.080     3.73     2.30
65    J114008.71+030711.4  J114008.7+030710   0305920201  34    23860 ± 156   0.081     1.93     3.91   b
71    J124635.24+022208.7  J124635.3+022209   0051760101  4     31746 ± 179   0.048     1.85     12.51  c
100   J221918.53+120753.1  J221918.5+120753   0103861201  8     19684 ± 141   0.081     5.03     10.03  d
104   J092247.02+512038.0  J092246.9+512037   0300910301  6.3   12029 ± 111   0.160     1.32     18.90
111   J094240.92+480017.3  J094240.9+480017   0201470101  14    424 ± 24      0.197     1.20     5.87
125   J133141.03-015212.4  J133141.0-015212   0112240301  24    2548 ± 52     0.145     2.39     11.90
129   J081053.75+280610.9  J081053.8+280611   0152530101  16    986 ± 33      0.285     2.93     18.17
161   J123126.44+105111.3  J123126.4+105111   0145800101  7.7   269 ± 18      0.304     2.14     6.86
*163  J123748.49+092323.1  iJ123748.5+092323  0504100601                      0.125     1.48
171   J093922.90+370943.9  J093922.9+370942   0411980301  4     2857 ± 54     0.186     1.22     27.89
191   J155909.62+350147.4  J155909.6+350147   0112600801  11    80531 ± 285   0.031     2.11     8.22
195   J103438.59+393828.2  J103438.6+393828   0109070101  28    21857 ± 214   0.043     1.31            c.c
203   J124013.82+473354.7  J124013.8+473355   0148740501  5.7   804 ± 29      0.117     1.32     1.03
204   J124058.45+473302.0  J124058.3+473302   0148740501  5     192 ± 15      0.367     1.33     35.88
214   J112405.15+061248.8  J112405.1+061248   0103863201  5     786 ± 29      0.272     4.61     36.76
241   J075216.55+500251.3  J075216.4+500251   0151270201  7.7   620 ± 26      0.263     5.17     57.45
275   J145108.76+270926.9  J145108.7+270926   0152660101  18    91984 ± 305   0.065     2.70     20.35  f
302   J102812.67+293222.8  J102812.6+293222   0301650401  8.4   76 ± 13       0.287     1.91     3.57
318   J122230.71+155547.9  J122230.7+155547   0106860201  8.6   238 ± 17      0.367     1.99     16.00
329   J140621.89+222346.5  J140621.8+222347   0051760201  3.1   1911 ± 44     0.098     2.05     3.24   e,g
*335  J014856.95+135451.8  J014856.9+135450   0094383401                      0.220     4.90
338   J082912.67+500652.3  iJ082912.8+500652  0303550901  2.2   442 ± 22      0.043     4.07     2.51   b
355   J131718.58+324035.6  J131718.6+324036   0135940201  10    161 ± 13      0.061     1.17     2.27
357   J134235.66+261534.0  J134235.6+261534   0108460101  26    867 ± 30      0.064     1.03     2.59

Sub-sample of optically-classified SF galaxies
8     J140919.94+262220.1  J140920.0+262219   0092850501  35    163 ± 20      0.059     1.40     6.0
56    J095848.66+025243.2  J095848.6+025243   0203362101  54    259 ± 32      0.079     1.83     36.0
79    J093402.02+551427.8  J093401.9+551428   0112520101  27    2178 ± 56     0.002     2.46     0.2
154   J162636.40+350242.0  iJ162636.5+350242  0505011201  14    127 ± 12      0.034     1.44     4.2
164   J080629.80+241955.6  J080629.7+241956   0203280201  6     156 ± 21      0.042     3.80     45.0
233   J122254.57+154916.4  J122254.6+154916   0106860201  10    473 ± 32      0.005     2.01     0.8
246   J085735.33+274605.1  J085735.4+274607   0210280101  68    256 ± 18      0.007     2.51     0.2
251   J123520.04+393109.1  J123519.9+393110   0204400101  26    97 ± 8        0.021     1.31     0.5

Notes. Left to right: numeric identifier of the source; SDSS object name, where the full name should be 'SDSS . . . '; 2XMM object name, where
the full name should be '2XMM J. . . '; XMM-Newton observation number; XMM-Newton exposure time in units of ks; the total counts in
the EPIC-pn detector; redshift; Galactic column density from Dickey & Lockman (1990) in units of 10^20 cm^-2; and the hard X-ray luminosity. The last
column gives references about the classification of some sources: (a) Foschini et al. (2004), (b) Dewangan et al. (2008), (c) Piconcelli et al. (2005),
(d) Gallo et al. (2006), (e) Maitra & Miller (2010), (f) Grupe et al. (2010), (g) Crummy et al. (2006a) (among others).
* These sources could not be analysed owing to the poor data statistics (fewer than 50 EPIC-pn counts).



We used single- and double-pixel events for all observations.
The background spectra were extracted from nearby circular
regions free of sources. Spectral response files were generated
using the SAS tasks rmfgen and arfgen. The epatplot SAS
task was used to test for the presence of pile-up. The EPIC-pn
X-ray spectra of all but two of the observations (sources
2XMM J124635.3+022209 and 2XMM J103438.6+393828,
see Table 2) were found to be free from the effects of pile-up.
We performed an X-ray spectral analysis with XSPEC v12.5
(Arnaud 1996), taking the limits of accurate calibration of the
pn data as 0.3-12 keV. The source spectra were grouped to
have at least 20 counts in each bin in order to apply the modified
χ² minimization technique; the lowest quality spectra
(100-300 counts) were only grouped with at least 15
counts per bin. We did not carry out an X-ray spectral analysis
for the faintest sources (< 100 counts). All quoted errors are for a
90 per cent confidence interval for one parameter (Δχ² = 2.706).
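
The two numerical conventions in this paragraph, grouping to a minimum number of counts per bin and the Δχ² = 2.706 criterion, can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the example counts are ours, and the actual grouping was presumably done with standard X-ray tools rather than with code like this.

```python
import math


def group_min_counts(counts, min_counts=20):
    """Group consecutive channels so that each bin holds >= min_counts.

    Returns a list of (start, stop) channel index ranges, stop exclusive.
    Leftover channels at the end are merged into the last complete bin.
    If the whole spectrum has fewer than min_counts, the list is empty.
    """
    bins = []
    start = 0
    running = 0
    for k, c in enumerate(counts):
        running += c
        if running >= min_counts:
            bins.append((start, k + 1))
            start = k + 1
            running = 0
    if start < len(counts) and bins:
        first, _ = bins[-1]
        bins[-1] = (first, len(counts))
    return bins


def confidence_one_parameter(delta_chi2):
    """Confidence level enclosed by a given delta chi^2 for one parameter.

    For one degree of freedom the chi^2 cumulative distribution is
    erf(sqrt(x / 2)).
    """
    return math.erf(math.sqrt(delta_chi2 / 2.0))


# Hypothetical counts per channel, for illustration only.
example_counts = [5, 8, 9, 3, 30, 1, 1]
print(group_min_counts(example_counts, min_counts=20))
print(f"{confidence_one_parameter(2.706):.3f}")
```

Grouping to at least 15-20 counts per bin keeps the Gaussian approximation behind the χ² statistic reasonable, and the second function confirms that Δχ² = 2.706 corresponds to 90 per cent confidence for a single parameter of interest.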



The X-ray selection criteria resulted in a minimum of 30 
counts at energies above 2 keV in at least one detector inde- 
pendently of the quality of the spectrum, which means that the 
S/N was sometimes low. Thus, the level of detail of our spec- 
tral analysis varied for each source depending on the quality of 
the EPIC-pn spectra, ranging from a quite detailed analysis for 
bright sources, to only very coarse spectral fits for the faintest. 
3.2. Missing-AGN subsample

We now study the nature of the missing-AGN subsample, whose
members are possibly NLS1s. Two objects (2XMMi J123748.5+092323
and 2XMM J014856.9+135450, see Table 2) simply could
not be modelled owing to the low number of counts (< 50).
For three other sources (2XMM J141519.4-003021, 2XMM
J135724.5+652506, and 2XMM J123126.4+105111, see
Table 2), the energy range used in the spectral analysis
was E < 4 keV because the EPIC-pn data are dominated
by the background above these energies. Finally, 2XMMi
J082912.8+500652, 2XMM J131718.6+324036, and 2XMM
J134235.6+261534 were analysed using only EPIC-MOS data
because of the lack of EPIC-pn data. In Table 2 we give details
Fig. 5. Example of a typical NLS1 with moderate Fe II emission
(top panel) and an SDSS spectrum of an Fe II-lacking NLS1
(bottom panel). Note in the latter spectrum the very narrow
Balmer emission lines and the very weak Fe II multiplets; the
prominent broad He II emission line ensures our classification
as a type 1 AGN.

Fig. 6. Normalized distribution of the relative strength of the
Fe II multiplets, R4570, for the missing-AGN subsample (filled
histogram) and the NLS1 sample of Zhou et al. (2006).



of the X-ray observations and the hard X-ray luminosity of each
object, which was calculated using the best-fit power-law model
over the 2-10 keV energy band. We note that the Galactic
absorption is implicitly included in all the spectral models
presented hereafter, at the values given in Table 2.

The general best-fit model of the X-ray spectrum emitted
by a NLS1 has typically four components: an underlying
absorbed steep power law, a soft X-ray excess, and a reflection
component that might also include a broad feature near the
Fe line complex at 5-7 keV. Several explanations have
been proposed for the origin of the observed soft excess,
such as relativistically blurred photoionized disc reflection
(Ross et al. 2002; Crummy et al. 2006b), an intrinsic thermal
component, or ionized absorption arising in a wind from the
inner disc (Gierliński & Done 2004, 2006). For simplicity,
we only used two different two-component continua: a partial
covering and a thermal model as proxies to each explanation
respectively (the quality of the X-ray data did not allow a more
sophisticated analysis in the majority of cases). In the case
of the first two-component continuum, we modelled the soft
excess emission due to reprocessing of the primary X-rays as
partial covering by neutral material (PCF model). This can be
regarded as a physical model of a clumpy torus, where the torus
is a smooth continuation of the broad line region, and provides
a physical explanation of the apparent mismatch between the
optical classification and the X-ray properties of these objects.
In this model, X-ray absorption, dust obscuration, and broad
line emission are produced in a single continuous distribution
of clouds: the broad line region is located within the dust
sublimation radius, hence the non-dust-free clouds obscure
the optical emission but not the X-rays, whereas the torus is
located outside the dust sublimation radius. The PCF model
assumes that some fraction, f, of the X-ray source is covered
by a neutral absorber with a column density of N_H,z, while the
rest is unobscured. This could be responsible for an apparent
soft excess in two different geometries, either by reflection from
optically thick material out of the line of sight (Fabian et al.
2002), or absorption by optically thin material along the line of
sight (Gierliński & Done 2004; Chevallier et al. 2006).
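
The effect of a partial-covering absorber on the transmitted spectrum can be written as a multiplicative factor, f exp(-τ) + (1 - f), where τ = N_H,z σ(E) is the optical depth of the absorber at a given energy. The Python sketch below evaluates this factor; it is a minimal illustration in which the function name is ours and the optical depth is supplied directly, because the energy-dependent photoelectric cross-section σ(E) is not modelled here.

```python
import math


def partial_covering_transmission(covering_fraction, optical_depth):
    """Fraction of the intrinsic flux transmitted by a partial coverer.

    covering_fraction: fraction f of the source hidden behind the absorber.
    optical_depth: tau = N_H * sigma(E) at the energy of interest (assumed
        to be supplied by the caller; no cross-section model is included).
    """
    if not 0.0 <= covering_fraction <= 1.0:
        raise ValueError("covering_fraction must lie between 0 and 1")
    if optical_depth < 0.0:
        raise ValueError("optical_depth must be non-negative")
    absorbed_part = covering_fraction * math.exp(-optical_depth)
    uncovered_part = 1.0 - covering_fraction
    return absorbed_part + uncovered_part


# Illustrative values only: f = 0.7, optically thick at low energies and
# nearly transparent at high energies.
soft_band = partial_covering_transmission(0.7, optical_depth=50.0)
hard_band = partial_covering_transmission(0.7, optical_depth=0.01)
print(f"soft-band transmission ~ {soft_band:.2f}")
print(f"hard-band transmission ~ {hard_band:.2f}")
```

At energies where the absorber is optically thick, only the uncovered fraction 1 - f of the primary emission leaks through, which is why a partial coverer can mimic a soft excess above a heavily absorbed continuum.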

On the other hand, there are some possible ways of
explaining the soft excess from the disc itself in terms of the
reprocessing of the primary X-rays in the accretion disc as
reflected emission from a geometrically flat disc, with solar
abundances, illuminated by an isotropic source. Thus, the soft
excess is sometimes closely fitted by a black body that has a
roughly constant temperature of 0.1-0.2 keV. If this radiation
is thermal, this temperature is much too high to be explained
by the standard accretion disc model of Shakura & Sunyaev
(1973), although it could be explained by a slim accretion disc
in which the temperature is raised by photon trapping, in which
case the accretion is super-Eddington (Tanaka et al. 2005), or
by the Comptonization of extreme UV accretion disc photons
(e.g. Porquet et al. 2004).

We carried out the hard X-ray spectral fitting procedure
using the following scheme:

1. We first fitted the individual data in the 2-10 keV energy
range (when possible), using a power law modified by
absorption from cold gas in our Galaxy (hereafter PL).

2. We then checked for any significant additional component
that may be present in this energy range, such as an iron
emission line (modelled with zgauss at E_c ~ 6.4 keV)
and/or an Fe K-edge among other features.
Fig. 7. Distribution of Γ(2-10 keV) for the "missing AGN" when
we fitted the individual data in the 2-10 keV energy range with
a single power law modified by absorption in our Galaxy. The
vertical line marks the expected value for objects of Seyfert 2
type.



We found that the PL yields an acceptable fit for almost all
objects in the sample, in terms of the minimum χ² (see Table 3).
No other statistically significant and physically meaningful
features were found in the hard X-ray spectra. Figure 7 shows
that the distribution of Γ(2-10 keV) takes values above the
typical expected value (~1.8) found for type 2 AGNs. We note that
the cases where the X-ray spectra appear very hard, with a
power-law slope Γ ~ 1.5, also correspond to those with larger
errors, ±0.5 (see Table 3).

We also attempted to fit the spectrum over the entire X-ray band
(0.3-10 keV) with an intrinsically absorbed power-law model
(absorbed-PL). We found that for all sources this simple model
was rejected over the whole X-ray band with a high statistical
significance. The estimated intrinsic column density was found
to be less than 10^22 cm^-2, which corresponds to an unabsorbed
model or at least to values of absorption that fall in the low part of
the column density distribution of type-2 AGNs. These very low
upper limits are not quoted in Table 3.

We found additional evidence against starburst activity in
the missing-AGN population by comparing the soft (0.3-2.5
keV) versus hard (2-10 keV) spectral indices, which revealed
an overall spectral steepening towards low energies in many
cases. This suggests that there is a soft X-ray excess that
contributes mostly below ~2 keV. For the majority of the
sources, this soft excess represents more than 20% of the X-ray
emission, which is higher than expected for starburst activity. We
show this soft component in Figure 8, defined as the excess over
an extrapolation to 0.3 keV of the PL model fitted to the hard
X-ray band.

We carried out the soft X-ray component fitting procedure
using the following approach and used the probability of the
F-test, accepting additional spectral components only when they
improved the fit with a significance > 3σ:

1. We added either a redshifted black body component (zbbody
in XSPEC) to the PL model (hereafter BB-PL model) or a
neutral absorber at the redshift of the X-ray source that
either fully (f = 1) or partially (f < 1) covers the source
(zpcfabs in XSPEC; hereafter PCF-PL model).

2. We then compared these two models (BB-PL and PCF-PL).
When one of the two models gave a fit with Δχ² =
χ²(PL) - χ²(BB-PL/PCF-PL) ≳ 10 and/or an F-test significance
> 99%, the new model was taken as the
baseline. In the case of sources for which the χ² values for
the BB-PL and PCF-PL models were comparable and the values of
each free parameter were physically plausible for both
components, we adopted as the best-fit model the one with the
least uncertain model parameters.
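
The F-test used in the scheme above compares the improvement in χ² with the residual χ² per degree of freedom of the more complex model. The Python sketch below is a hedged illustration rather than the procedure actually run in XSPEC: it assumes that the extra component adds exactly two free parameters (for example a black-body temperature and normalization, or an absorber column density and covering fraction), in which case the F-distribution tail probability has a simple closed form. The function name and the example χ² values are ours.

```python
def f_test_two_extra_parameters(chi2_old, dof_old, chi2_new, dof_new):
    """F statistic and chance probability when two parameters are added.

    Assumes dof_old - dof_new == 2. For an F distribution with (2, n)
    degrees of freedom, P(F > x) = (1 + 2 * x / n) ** (-n / 2).
    A small probability means that the extra component is warranted.
    """
    if dof_old - dof_new != 2:
        raise ValueError("this closed form needs exactly two extra parameters")
    if chi2_new <= 0.0 or dof_new <= 0:
        raise ValueError("chi2_new and dof_new must be positive")
    f_stat = ((chi2_old - chi2_new) / 2.0) / (chi2_new / dof_new)
    if f_stat <= 0.0:
        # The more complex model did not improve the fit at all.
        return f_stat, 1.0
    probability = (1.0 + 2.0 * f_stat / dof_new) ** (-dof_new / 2.0)
    return f_stat, probability


# Hypothetical fit statistics, for illustration only.
f_stat, prob = f_test_two_extra_parameters(120.0, 102, 100.0, 100)
THREE_SIGMA = 0.0027  # two-sided Gaussian tail probability at 3 sigma
print(f"F = {f_stat:.1f}, probability = {prob:.2e}")
print("component accepted" if prob < THREE_SIGMA else "component rejected")
```

With these illustrative numbers the improvement of Δχ² = 20 for two extra parameters passes the 3σ threshold, whereas an improvement of only a few in χ² would not.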

Table 4 shows the best-fit model parameters for each source.
The addition of either a BB or PCF component provides a good
match to the observed spectra in almost all objects with more
than 100 counts, and correspondingly provides a closer fit,
according to the χ² test, than the absorbed-PL model (see Δχ²
and/or P_F-test in Table 4). An additional BB component was
required to achieve good spectral fits in about one-third of these
objects (first part of Table 4); the inferred electron temperatures
are found to be in the range 100-200 eV, which is slightly higher
than those found for classic AGNs. Such high temperatures could
be explained by the presence of a slim accretion disc in which
the temperature is raised by photon trapping (Abramowicz et al.
1988; Mineshige et al. 2000). For another third of the
missing-AGN subsample (second part of Table 4), the best-fit models
were achieved by the addition of a PCF component. The spectral
fits for a partial covering model indicate that there were
variations in both the absorption column density, N_H,z = (1-6) × 10^21
cm^-2, and the covering factor, f = 0.6-0.9. The mean unabsorbed
fraction of the primary X-ray emission is <1 - f> = 0.32
(> 0.2 for the great majority), which is slightly higher than
expected for type-2 AGNs (< 0.1-0.2).

For the remaining third of the sources, the low quality of
their X-ray spectra did not allow us to choose between the
various possible models. In summary, the X-ray spectra of the
subsample of missing-AGNs were closely fitted by a rather
steep power law, to which a soft excess apparent at energies
< 2 keV should be added when data of sufficiently high quality
become available. This is fully in line with the assumption
that this population is largely dominated by NLS1s.



3.3. True-SF subsample 

In parallel, we conducted an X-ray spectral analysis of the 
true-SF subsample. We were able to perform the spectral 
analysis of 8 of the 38 sources only, because of the poor X-ray 
spectral quality of the remaining 30. 

The X-ray spectra of local SF galaxies in the 0.5-10
keV band can be described by a combination of warm thermal
emission (with typically kT ~ 0.6-0.8 keV) dominating
at energies < 1 keV, and a power-law spectrum responsible
for producing the bulk of their 2-10 keV flux. The latter
component has various interpretations, either in terms of an
extremely hot (kT > 5 keV) thermal component or a Γ ~ 2
power-law model for high mass X-ray binaries. Given that SF



5 We note that for most of the sources the X-ray data quality is too 
poor to attempt more sophisticated fits. 
Table 3. Missing AGNs. Results of fitting the spectral data with both a power law in the hard X-ray energy band (2-10 keV) and
an absorbed power-law model across the whole X-ray spectrum (0.3-10 keV).

Columns for the power-law model over 2-10 keV: ID, Notes, Γ, χ²/d.o.f.
Columns for the absorbed power-law model over 0.3-10 keV: ID, Notes, Γ, χ²/d.o.f.


26 




1.50±$£ 


1.020/3 


26 


2 79+ ,) uy 

09 


1.213/63 


42 




2.23±| 


0.680/20 


42 


2 39+ o: ° 5 

05 


0.933/161 


53 


a 


2.75±» : « 


1.306/11 


53 


a 2 57+" :n 


1.154/43 


63 


a 


2.20±jf 


0.829/7 


63 


a 1 91+ li:i ' i 

on 

2 79+n- 02 

' ,y 0.02 


1.500/18 


65 




? 27+ 8 - 14 

12 

2.90±™ 


1.156/54 


65 


1.370/316 


71 


c 


2.080/112 


71 


C 




100 




23S± l\l 


1.182/47 


100 


2.95±°$ 


1.360/301 


104 




2 24+" M 

0.51 


0.581/9 


104 




1.350/170 


111 


b,l 




111 


M 2.72±o| 


0.676/31 


125 


b,2 


1 71+ 041 

061 

2 31+ li: " 

J 0.74 


0.655/6 


125 


fc,2 2.58± : 
2 31+ !i: '2 

11 


0.993/85 


129 




1.270/3 


129 


1.294/39 


161 






161 


3 17+ o:34 

q as 


1.762/8 


171 




2. 19+ 


0.811/3 


171 


8.Q9 


1.437/81 


191 




2,ii±8:S 


1.195/217 


191 


2 67+ !i iil 
^■ u '-o.oi 


1.475/562 


195 




2 10±™ 


1.021/299 


195 


2 10+ 002 


2.021/587 


203 






0.817/4 


203 


2 24+°- 12 

fii? 


0.949/41 


204 


a 


^- UJ -0.66 


0.220/3 


204 


a 2 45+ 035 

2 70+° 14 

Q-14 


0.367/13 


214 




l-64±; 


0.958/4 


214 


1.052/32 


241 




1.28±f 


0.810/19 


241 


2 37+ 018 
2.85±|| 


0.899/102 


275 




2 75+ - 07 
z "' J -o.o7 


0.897/213 


275 


1.657/558 


318 


b, 1 




318 


M 2.77±i 


0.894/30 


329 


c 


1 74+ - 57 

1 - '^-0.90 


1.910/3 


329 


c 




338 




2 20+ - 60 

OH 


0.802/7 


338 




0.979/63 


355 




1 57+ 030 


0.008/1 


355 


BIS 


1.002/13 


357 




1.83±i 


0.556/11 


357 


2 05+° ™ 


1.231/66 



Notes. (a) The hard photon index was obtained in the energy band 1-10 keV due to low statistics;

(b) The pn data are dominated by the background: (1) above 2-3 keV, (2) above 6 keV;

(c) The soft excess is very strong (ratio > 30), invalidating the application of a PL fit across the whole X-ray spectrum.



spectra often also exhibit strong collisionally excited emission
lines, we used a MEKAL model (Mewe et al. 1985, 1986)
to fit the thermal component. This component appeared at
soft X-ray energies (below 1-2 keV), and we added a PL
component to fit the hard X-ray spectrum. The X-ray spectra
were modelled by adding both components, which resulted in
good fits for only 3 out of 8 objects (2XMM J140920.0+262219,
2XMM J093401.9+551428, and 2XMM J122254.6+154916,
see Table 2) with kT ~ 0.6-3 keV and Γ ~ 2.2; the resulting
parameters were consistent with those expected for a SF galaxy.
The thermal component contributes significantly over the
0.3-10 keV range, supplying ~20% of the total flux. Hence,
a hot gas starburst component was found to be present in the
spectra of these three objects. For the remaining five of the
eight objects, the improvement to the fits when a MEKAL
component was added to a PL was minimal, with Δχ² < 5 for two
additional degrees of freedom (see Table 5). Hence, their X-ray
spectra were modelled by a PL only, which resulted in an
acceptable fit; the resulting parameters, Γ ~ 1.7-2, were
compatible with those for a SF galaxy, as expected. We therefore
restricted our analysis of the remaining 30 sources to an
inspection of the hardness ratios. In Figure 2 we can see that the
hardness ratio of all sources covers a wide range of values, and
therefore we are unable to reach any firm conclusion based on
these data. However, the invariably low values of the HR for
these particular sources are consistent with them being dominated
by a thermal spectrum, in full agreement with the expectation for
SF galaxies.

As a final additional test, we fitted the missing-AGN X-ray
spectra as if they were true-SF sources. We found that the soft
X-ray excess is not properly described by thermal (MEKAL)
emission in that case.

3.4. Missing-AGNs versus type-2 AGNs 

We then assessed the different nature of the missing-AGN
sources by comparing them with spectral fits of known type-2
AGNs. To avoid composite objects we adopted the Kewley et al.
(2001) criterion to secure optically classified AGNs. As for the
missing-AGN population, we only used the sources with a
minimum of 50 counts in at least one detector. This requirement
resulted in the selection of 56 AGNs. From inspection of the
X-ray data, we excluded 15 of these objects that were dominated
by the background above ~2-3 keV. Finally, we removed those
sources that had previously been classified as LINERs. The
final sample contains 34 bona-fide type-2 AGN candidates.

The results of our analysis are as follows:

1. The spectra of 10 type-2 AGNs (~29%) are best fitted with
a simple power law. The average Γ obtained as a function of
the 2-10 keV flux is harder (<Γ> = 1.8) than the typical
values of ~1.9 and ~2.1 found in the unabsorbed AGN and
the missing-AGN subsamples, respectively.
Table 4. Missing AGNs. Best-fit models for the soft X-ray excess.

BB = PHA * ( zPOW + zBB ). Columns: ID, Γ, kT, χ²/d.o.f., Δχ², P_F-test †
PCF = PHA * zPOW * zPCF. Columns: Γ, N_H,z, f, χ²/d.o.f., Δχ², P_F-test †







BB-PL as the best-fit model 



65 


2 52+"- U4 

Q-Q3 


137±| 


0.987/314 


91.4 


~0 


2.8LC 


49+- 


0.7= 


h U.2 
c 0.1 


1.171/314 


33.6 


~0 


71" 


2 50+™ 
265+^ 


160±!» 


1.140/110 


108 


(*) 










100 


160+1° 


1.100/299 


81.5 


~0 


2.98±°;| 


34+|? 


0.6= 


,0.1 

c 0.1 


1.230/299 


41.5 


~0 


104 




H9±t 


1.026/168 


100.4 


~0 


3 50+ 04 


21 + 10 


0.8= 


h 0.1 
c 0.2 


1.537/168 


18.9 


~0 


195" 


2.30±;| 


143+ 3 ? 


1.140/210 


68 


~0 










241 


1 79+ - 23 

tiy 0.21 


I12±f 


0.764/100 


15.3 


~0 


2 62+ 038 




0.7= 


c 0.2 


0.810/100 


10.7 


~0 


275 


2 51+ - 02 


116+ 2 


1.082/556 


323.0 


~0 


2 91+ 002 


15+4 


0.57: 


.0.03 
E 0.04 


1.171/556 


273.5 


~0 


329" 


104±5 


0.844/95 


214.9 


(*) 













PCF-PL as the best-fit model 



26 


247± o.ii 
2 22+™ 


110+- 


1.169/61 


5.1 


0.12 




34± 


°- 8± Q 


1.072/61 


11.1 


~0 


42 


155+ 22 


0.889/159 


8.9 


~0 


2.4i ±:; 


65+ 


0.6+; 
0.6+; 
0.6+; 


0.923/159 


29.8 


0.157 


125 


2 2 f 4 - + w 


102± 2 ^ 


0.936/84 


6.5 


~0 


2 65+° 09 


0.880/84 


10.5 


~0 


171 


3.04+H 


> 2 • 10 3 


0.901/79 


12.8 


~0 


3 04+° 12 

Q-QS 


n4 

23±| 


0.874/79 


14.9 


~0 


191 


106+} 


1.178/560 


168.4 


~0 


2.73+™ 


0.6±»;| 


1.041/560 


245 


~0 


214 


1 98+ 020 


144+; 3 


0.810/30 


9.3 


~0 


2.814S 


43±£ 


o-9±!J:2 


0.681/30 


13.2 


~0 



BB and/or PCF extra component could not be excleded 



53 


2.41±o J6 
? 03+^ 

z 0.22 




1.070/41 


5.7 


0.08 


2.01±g:| 


0.05±»f b 


o.9±;; 


1.130/41 


< 4 


0.25 


63 


> 2 • 10 3 


1.553/16 


< 3 


0.51 


32+;; 


o.7±;; 


1.522/16 


< 3 


0.44 


129 


2 12+ - 08 

0.16 


87±^ 


1.195/37 


6.3 


0.08 


2 67+ !i - 24 


2+ 157 


5+ 02 

" ,J -0.3 


1.256/37 


<5 


0.22 


203 


2.16+ ; 


< 1 


0.892/39 


< 5 


0.12 


234+^ 
2-86±g 


30+;; 


o.7±;; 


0.935/39 


< 5 


0.09 


204 


2-03±| 


120+:; 


0.277/11 


< 5 


0.08 


4+;; 


o.7±;; 


0.243/11 


<5 


0.01 


338" 


2.28±;J 


197+17 


0.985/61 


< 5 


0.45 












357" 


2.22+JJ- 


1710+™ 


1.198/64 


< 5 


0.16 















Notes. The table is divided into three parts: sources for which an additional black-body component was required to obtain the best-fit model; 
sources for which the addition of a partial-covering absorber was required; and, finally, cases in which the low statistics did not allow us to reject 
either of the components. kT is given in units of eV and the intrinsic equivalent hydrogen column density in units of 10^20 atoms cm^-2. 
** denotes a parameter for which XSPEC could not calculate an error bar at the 90 per cent confidence level. 

† If the probability of the F-test is low (close to zero), then it is reasonable to add the extra model component. 

(a) The fit is insensitive to either the N_H,z or the f parameter of the PCF model. 

(b) The X-ray spectra could not be fitted by the PCF model. 

(*) The soft excess is extremely strong, invalidating the use of an absorbed PL fit over the whole X-ray spectrum. 

We could not analyse the X-ray spectra of sources 15, 163, 324, 335, and 355 because of the small number of counts. Furthermore, sources 
111, 161, 189, 198, and 302 were not analysed using any of these models, because their hard X-ray spectra are dominated by background, reducing 
the detectability of their source counts. 
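
The F-test probabilities listed in Table 4 can be approximately reproduced with a short calculation. The sketch below is not the authors' procedure (the notes only mention XSPEC). It assumes that each extra component (black body or partial covering) adds two free parameters, and that the tabulated χ²/d.o.f. gives the reduced χ² and the degrees of freedom ν of the fit that includes the extra component. Under these assumptions the statistic F = (Δχ²/2) / χ²_red follows an F-distribution with (2, ν) degrees of freedom, whose tail probability has the closed form (1 + 2F/ν)^(-ν/2). The function name is hypothetical.

```python
def ftest_probability(delta_chi2, reduced_chi2, dof):
    """F statistic and tail probability for adding TWO free parameters.

    Assumptions (not stated in the source): the extra component adds exactly
    two parameters, and `reduced_chi2`/`dof` belong to the more complex fit.
    For an F-distribution with (2, dof) degrees of freedom the survival
    function is exactly (1 + 2F/dof) ** (-dof / 2).
    """
    f_stat = (delta_chi2 / 2.0) / reduced_chi2
    prob = (1.0 + 2.0 * f_stat / dof) ** (-dof / 2.0)
    return f_stat, prob


# Black-body (BB) entries taken from Table 4: (ID, delta chi^2, reduced chi^2, d.o.f.)
examples = [
    (53, 5.7, 1.070, 41),
    (26, 5.1, 1.169, 61),
    (65, 91.4, 0.987, 314),
]

for source_id, delta_chi2, reduced_chi2, dof in examples:
    f_stat, prob = ftest_probability(delta_chi2, reduced_chi2, dof)
    print(f"ID {source_id}: F = {f_stat:.2f}, P = {prob:.2g}")
```

With these assumptions, the BB entries for sources 53 and 26 give probabilities that round to 0.08 and 0.12, in line with the tabulated values, while source 65 gives a probability indistinguishable from zero.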

Table 5. Star-forming galaxies. Best-fit models for the star-forming galaxies. 

Source  Model  Γ     kT     χ²/d.o.f.
008     A+B    2.16  0.901  8.80/8     > 0.97  ~ 13
079     A+B*   2.35  3.71   103.25/98  ~ 1     > 200
233     A+B    2.11  0.62   104.68/90  ~ 1     > 100
056     A      1.84  …      21.39/19   …       …
149     A      1.83  …      4.76/5     0.0927  < 6
164     A      1.94  …      17.06/14   0.2165  < 1
246     A      1.88  …      13.75/22   …       …
251     A      1.40  …      6.35/5     0.63    < 4

… = value not available.

Notes. A: power law; B: mekal. Galactic neutral hydrogen column density in units of 10^20 atoms cm^-2, from Dickey & Lockman (1990). 
* Intrinsic absorption with N_H,z = 2.1 in units of 10^21 atoms cm^-2. 

2. The spectra of 9 AGNs (~ 26%) could be most closely fitted 
by the addition of intrinsic absorption at the level of a few 
× 10^22 cm^-2, and the average Γ is also ⟨Γ⟩ = 1.8. 

3. The addition of another PL component, as a proxy for a 
partially covering absorber or scattering of the AGN light, 
was required to achieve good spectral fits in another 14 
AGN (~ 41%). We found that the average photon index 
is ⟨Γ⟩ = 2.2 and the average intrinsic column density is 
⟨N_H,z⟩ ~ 5 × 10^23 cm^-2. The measured amounts of X-ray 
absorption for the missing-AGN subsample are lower (by two 
orders of magnitude), falling in the low part of the column 
density distribution of type-2 AGNs. The average measured 
strength of the X-ray primary emission not absorbed by the 
neutral material was softer (~ 8%) than the 
average of the missing-AGN subsample. Soft excess emission 
is not detected with an F-test significance > 99% in the 
vast majority of the objects. 
4. Finally, we found a significant X-ray soft excess in only 1 
AGN (~ 3%), which is clearly negligible compared with the 
strong soft excess displayed by the missing-AGN subsample. 

The conclusion is therefore that the X-ray properties of the 
missing-AGN population, which we interpret as being dominated 
by NLS1s, clearly differ from those of the type-2 AGNs 
in our sample. 
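
The percentages quoted in the four items above can be checked against the sample size of 34 type-2 AGN candidates. The snippet below is only a bookkeeping sketch; the dictionary keys are shorthand labels introduced here, not terms from the analysis.

```python
# Number of type-2 AGN per best-fit spectral description (items 1 to 4 above).
# The keys are illustrative shorthand labels.
counts = {
    "simple power law": 10,
    "power law + intrinsic absorption": 9,
    "additional PL (partial covering / scattering)": 14,
    "significant soft excess": 1,
}

total = sum(counts.values())  # should recover the 34 bona-fide candidates

# Percentage of the sample in each group, rounded to the nearest integer.
percent = {label: round(100 * n / total) for label, n in counts.items()}

for label, n in counts.items():
    print(f"{label}: {n}/{total} = {percent[label]}%")
```

The four groups add up to the full sample of 34 objects, and the rounded fractions reproduce the quoted ~29%, ~26%, ~41% and ~3%.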

4. Discussion 

We investigated the nature of galaxies that are optically 
classified as star-forming, but whose X-ray luminosities are in 
excess of 10^42 erg s^-1, hence indicative of an AGN (missing-AGN 
subsample), by exploring both the X-ray and optical properties of 
a sample of NELGs. The availability of catalogues at different 
wavebands with wide sky coverage allowed us to assemble 
optical (SDSS-DR7) and X-ray (2XMMi catalogue) information 
for a large sample of NELGs. A total of 1729 NELGs fall in 
the region covered by the XMM-Newton observations from which 
the 2XMMi catalogue was built. Out of these, we find 211 X-ray 
detections and 1518 upper limits. 

For the 211 NELGs detected in X-rays, we compared the 
optical classification based on the BPT diagram with their 
hard X-ray luminosity, used as an indicator of AGN activity. 
We found that about 40% of the BPT-AGN subsample exhibit 
low luminosities not exceeding 10^42 erg s^-1 (weak-AGN 
subsample). Using other optical spectral features, such as 
the [OI] and [OII] emission lines, we find that 13% of the 
weak-AGN subsample are classical LINERs and 29% are likely 
to be weak-[OI] LINERs. The LINER X-ray spectral energy 
distribution can be interpreted as a combination of a soft thermal 
component plus a hard power law (see González-Martín et al. 
2006). This soft component may arise from circumnuclear star 
formation, which could also explain their emission-line ratios. 

However, the most striking result of this work is that 
virtually all sources in the missing-AGN subsample, which have 
been classified as BPT-SF according to the BPT diagram and 
yet display clear signs of AGN activity in the X-ray band, are 
unquestionably NLS1s. To investigate this, we have calculated 
the hard X-ray luminosity, thickness parameter, X-ray-to-optical 
flux ratio and hardness ratio for the whole NELG sample. We 
found that SF and AGN galaxies cannot be clearly distinguished 
using a single criterion (Kauf03 or L_X). 
However, the combined use of the X/O flux ratio, T and HR does 
allow us to distinguish between SF galaxies and AGNs. For our 
sample of NELGs, which includes objects of high L_X and low 
redshift (z < 0.4), we found that the distributions of both the 
thickness parameter (T) and the X-ray-to-optical flux ratio (X/O) 
are bimodal, with the two populations being separated at about 
T ~ 1 and X/O ~ 0.1, respectively. We noted dichotomies in the 
BPT-SF population, i.e. between the missing-AGN subsample 
and true-SF galaxies, and, on the other hand, in the BPT-AGN 
subsample, i.e. between the weak-AGN and strong-AGN 
subsamples. 
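
The four subsamples named above can be summarised as a simple decision rule. The following is a minimal sketch, assuming that the 10^42 erg s^-1 threshold quoted in the text is applied to the 2-10 keV luminosity of both BPT classes; the function name, argument names and the string labels passed in are illustrative only.

```python
LX_THRESHOLD = 1e42  # erg/s, hard (2-10 keV) X-ray luminosity threshold from the text


def subsample(bpt_class: str, lx_hard: float) -> str:
    """Assign a source to one of the four subsamples discussed in the text.

    bpt_class: "SF" or "AGN", the optical classification from the BPT diagram.
    lx_hard:   2-10 keV luminosity in erg/s.
    """
    luminous = lx_hard > LX_THRESHOLD
    if bpt_class == "SF":
        # Optically star-forming but X-ray luminous: the "missing" AGNs.
        return "missing-AGN" if luminous else "true-SF"
    if bpt_class == "AGN":
        return "strong-AGN" if luminous else "weak-AGN"
    raise ValueError(f"unknown BPT class: {bpt_class!r}")


print(subsample("SF", 3e43))   # missing-AGN
print(subsample("AGN", 5e40))  # weak-AGN
```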

We performed several tests to determine whether the 
emission coming from missing-AGNs arises as a result of 
star formation processes or of AGN activity. The dichotomy 
in the BPT-SF population is directly linked to the values of 
the Hβ FWHM: all the galaxies with high L_X exhibit the broadest 
widths in the Hβ line, from ~ 600 to 1200 km/s, whilst 
the remaining sources, with L_X < 10^42 erg s^-1, display an 
Hβ FWHM < 600 km/s. Indeed, the missing-AGN subsample 
has high values of both X/O and T and a low measured hardness 
ratio for each source. We therefore conclude that these missing-AGNs 
are NLS1 candidates, whilst the rest (true-SF population) are 
consistent with being SF galaxies. 

Strong supporting evidence for the NLS1 nature of the 
missing-AGN population comes from a spectral analysis of 
their X-ray emission. The X-ray spectral properties of the 
missing-AGN subsample appear to be quite uniform. These 
spectra are well reproduced by the combination of a black body or 
a partial-covering absorber with a steep power-law component at 
hard X-ray energies. A complete study of their spectral energy 
distribution is in progress by Castello-Mor et al. (in prep.). 
Furthermore, we have established evidence of a missing-AGN 
population displaying a soft X-ray excess whenever spectra of 
sufficient S/N are available to model them. The missing-AGN 
population have Hβ line FWHMs larger than 600 km/s, and 
often display strong FeII emission. 

Therefore, we conclude that the population of the missing-AGN 
subsample is entirely constituted of NLS1s with very moderate 
broad-line velocities (600 km/s < FWHM(Hβ) < 1200 
km/s), which have X/O > 0.1, T > 1 and the vast majority have 
strong soft X-ray excess. 
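
The criteria in this conclusion can be written as a compact check. This is only an illustrative sketch of the quoted thresholds, not the selection code used for the sample; the class and field names are hypothetical, and the example values are made up for demonstration.

```python
from dataclasses import dataclass


@dataclass
class Source:
    xo_ratio: float    # X-ray-to-optical flux ratio (X/O)
    thickness: float   # thickness parameter T
    fwhm_hbeta: float  # FWHM of the H-beta line in km/s


def is_nls1_candidate(src: Source) -> bool:
    """Apply the thresholds quoted in the text:
    X/O > 0.1, T > 1 and 600 km/s < FWHM(H-beta) < 1200 km/s."""
    return (
        src.xo_ratio > 0.1
        and src.thickness > 1.0
        and 600.0 < src.fwhm_hbeta < 1200.0
    )


# Made-up example values, for illustration only.
print(is_nls1_candidate(Source(xo_ratio=0.5, thickness=3.0, fwhm_hbeta=900.0)))   # True
print(is_nls1_candidate(Source(xo_ratio=0.01, thickness=0.2, fwhm_hbeta=300.0)))  # False
```

The soft X-ray excess is left out of the check on purpose, since the text states that it is seen in the vast majority of the sources rather than in all of them.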
Fig. 8. X-ray spectral fitting and optical data for each object in the missing-AGN sample. The order of the objects in the figure follows 
that of Table 2. X-ray band fitting plots (odd panels): the individual hard (2 - 10 keV) X-ray data have been fitted with a power-law 
model modified by absorption; for those objects where the statistics made it possible, we have added the soft component, defined as the 
excess over an extrapolation down to 0.3 keV of the best-fit power-law model to the hard X-ray band. 
Even panels: SDSS spectrum plotted as log F_λ in units of 10^-17 erg/cm^2/s/Å. Note that, for two sources (163 and 335), we have no 
X-ray spectrum due to the low counts in the X-ray energy band. 